Abstract. This paper describes a model-based fault-detection and diagnosis system based on a distributed system 
identification approach. The diagnostic system consists of a two-level process including parallel hypothesis testing 
modules and a fault mode identification and estimation module. The proposed system is part of a distributed diagnostic 
system for use in an intelligent control system. The proposed approach utilizes a piecewise linear model to predict the 
system performance. The deviation between predicted and actual performance is used to identify the associated fault 
mode. Each hypothesis testing module is associated with a particular class of fault modes and can be viewed as a 
condition monitor in a distributed diagnostic system hierarchy. The results of the hypothesis modules are processed 
by the fault-detection and estimation module. Using the results of the on-line diagnosis, the intelligent control system 
will be able to accommodate the fault modes, reduce maintenance cost, and increase system availability. 


INTRODUCTION 

There is a growing demand to improve the control of 
systems for enhanced performance with increased reliability, 
durability and maintainability. This demand can be met by 
improving the individual reliability of system components and 
also by an intelligent control system with fault-detection, 
diagnostics and accommodation capabilities [1,2]. This paper 
focuses on the development of a model-based fault-detection 
and diagnosis (FDD) system which can be used as an integral 
part of such an intelligent control system. 

During the last two decades of the development of fault-
detection methods, the so-called model-based fault-detection 
approach has received considerable attention [3, 4, 5, 6]. These 
schemes basically rely on the idea of analytical redundancy. 
As opposed to physical redundancy which uses measurements 
from redundant sensors for fault-detection purposes, analytical 
redundancy is based on the signals generated by the mathemati- 
cal model of the system being considered. These signals are 
then compared with the actual measurements obtained from the 
system. The residual quantities are generated by comparing the 
measured and the model-generated signals. Hence, model-based 
fault-detection and diagnosis is defined as the determination 
of faults of a system from the comparison of the measurements 
of the system with a priori information represented by 
the model of the system. 

A fault is defined as a malfunction that deteriorates a 
plant’s ability to perform its specified tasks. Since the faults 
alter the system dynamics, they can be modelled as changes in 
the system’s parameters. The fault-detection task is the act of 
identifying the existence of these changes. The fault diagnosis 
task is the act of isolating and estimating the magnitude of the 
fault. The basis for the isolation of a fault is the fault signa- 
ture, i.e. a signal obtained from a diagnostic model defining the 
effects associated with a class of faults. A diagnostic model is 
obtained by defining the residual vector in such a manner that 
its direction is associated with known fault signatures. 
Furthermore, each signature has to be unique to one fault in 
order to accomplish fault isolation. A set of parity relations 
[3] or a set 
of unknown input observers [4], each assigned to be sensitive 
to a different fault, can be used for this purpose. 

The organization of this paper is as follows. First, the 
method of modelling a complex system will be described. This 
is followed by a description of diagnosis models which include 
process faults. Next, the architecture for fault-detection and 
diagnosis is described. Finally, simulation results of fault 
diagnosis of the Space Shuttle Main Engine (SSME) are given. 

PROCESS MODEL 

The nominal condition of a process under study can be 
modelled as a discrete, linear, time-invariant system described 
by: 

$$\begin{aligned} x(n+1) &= A\,x(n) + B\,u(n) \\ y(n) &= C\,x(n) \end{aligned} \tag{1}$$

where x is the state vector, u is the input vector and y is the 
output vector. 
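
As a minimal sketch of how equation (1) is iterated, the following Python fragment simulates a small system with NumPy. The matrices, the step input and the function name `simulate_lti` are illustrative assumptions and are not taken from the identified engine model.

```python
import numpy as np

def simulate_lti(A, B, C, u_seq, x0):
    """Iterate x(n+1) = A x(n) + B u(n), y(n) = C x(n) and return all outputs."""
    x = np.asarray(x0, dtype=float)
    outputs = []
    for u in u_seq:
        outputs.append(C @ x)   # y(n) is read from the current state
        x = A @ x + B @ u       # advance the state to x(n+1)
    return np.array(outputs)

# Hypothetical 2-state, 1-input, 1-output system (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

u_seq = np.ones((50, 1))        # unit step command
y = simulate_lti(A, B, C, u_seq, x0=[0.0, 0.0])
```

The same loop serves as the nominal predictor against which measured outputs can later be compared.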

The matrices A, B, and C of this model can be deter- 
mined by using a multivariable system identification technique. 
A system identification algorithm, developed in [7] to deter- 
mine these parameters based on the observability indices of the 
system from the measurements of the input and output data, 
was used in this paper. The A, B, C matrices obtained for this 
model will be used as baseline process parameters of the 
system. Any change in these parameters away from the baseline, 
observed through real-time identification and exceeding a 
preselected threshold, is used to detect and diagnose faults. 

Furthermore, if the system is to be operated over a wide 
range and a linear model cannot accurately represent the 
system characteristics, then a series of parameter 
identifications will be needed to cover the possible range of 
operating conditions. A piecewise linear model which links all 
the operating conditions can be described by: 

$$\begin{aligned} x(n+1) &= A(y_s)\,x(n) + B(y_s)\,u(n) \\ y(n) &= C(y_s)\,x(n) \end{aligned} \tag{2}$$

where $y_s$ is the scheduling variable and is a subset of the 
output measurement y. 
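
One possible reading of the piecewise linear model is a lookup of the parameter set identified nearest to the current scheduling value. The sketch below assumes nearest-point selection (the paper does not state how the operating points are linked), and the breakpoints and scalar models in `MODELS` are hypothetical.

```python
import numpy as np

# Hypothetical operating points: scheduling value -> identified (A, B, C).
# The breakpoints and scalar models are illustrative, not SSME data.
MODELS = {
    0.65: (np.array([[0.80]]), np.array([[0.20]]), np.array([[1.0]])),
    1.00: (np.array([[0.90]]), np.array([[0.10]]), np.array([[1.0]])),
}

def select_model(y_s):
    """Return the (A, B, C) set identified nearest to the scheduling value y_s."""
    nearest = min(MODELS, key=lambda point: abs(point - y_s))
    return MODELS[nearest]

def step(x, u, y_s):
    """One step of x(n+1) = A(y_s) x(n) + B(y_s) u(n), y(n) = C(y_s) x(n)."""
    A, B, C = select_model(y_s)
    return A @ x + B @ u, C @ x

x_next, y_now = step(np.array([2.0]), np.array([1.0]), y_s=0.95)
```

Interpolating between neighbouring parameter sets would be an equally plausible choice; nearest-point selection is used here only because it is the shortest to state.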


MODELLING THE PROCESS FAULTS 

In general, there are three classes of fault modes 
covered by the system performance model of equation (1), 
namely actuator faults, sensor faults and system performance 
degradation. In this study, actuator faults are modelled by 
changes of the actuation gain matrix B, sensor faults by 
changes of the observation matrix C, and system performance 
degradations (dynamic changes) by changes of the system 
characteristic matrix A. Under these assumptions, these 
fault modes can be isolated and diagnosed by analyzing the 
observed behavior through hypothesis testing, which will be 
described later. 

Actuator Fault Model 

An actuator fault occurs when the actuator output cannot 
follow the command signal. The error can be either 
multiplicative or additive. It can be described by the 
following equation: 

$$u_{af}(n) = F_a\,u_c(n) + f_{ao} \tag{3}$$

where $u_{af}(n)$ is the actual system input under the actuator 
fault condition and $u_c(n)$ is the commanded system input. $F_a$ 
is a diagonal matrix representing the multiplicative distortion 
of the command signal and $f_{ao}$ is a constant vector 
representing the bias, both with appropriate dimensions. 

During normal operation, $F_a = I$ and $f_{ao} = 0$. Different 
fault modes will result in different values of $F_a$ and $f_{ao}$. 
The values of $F_a$ and $f_{ao}$ will be estimated and used to 
identify the corresponding fault modes. 

By replacing the input signal u in equation (1) with the 
actual input signal $u_{af}$, a model for the system with 
actuator faults is obtained: 

$$\begin{aligned} x(n+1) &= A\,x(n) + B F_a\,u_c(n) + B f_{ao} \\ y(n) &= C\,x(n) \end{aligned} \tag{4}$$

$B F_a$ is the new input gain matrix and $B f_{ao}$ is a bias term. 

Sensor Fault Model 

Similar to the way actuator faults were handled, sensor 
faults can also be modelled as a combination of multiplicative 
and bias errors: 

$$y_{sf}(n) = F_s\,y(n) + f_{so} \tag{5}$$

where $y_{sf}(n)$ are the sensor outputs through possibly failed 
sensors and y(n) the actual process outputs. The matrix $F_s$ is 
a diagonal matrix for the multiplicative error and $f_{so}$ is a 
constant vector for the measurement bias, both with appropriate 
dimensions. During normal operation, $F_s = I$ and $f_{so} = 0$. 
This model can represent a wide range of sensor faults, such as 
calibration errors (one of the diagonal elements of $F_s \neq 1$ 
and/or $f_{so} \neq 0$), loss of signal (one of the diagonal 
elements of $F_s$ is 0), drift ($f_{so} \neq 0$), cross wiring 
($F_s \neq I$) and many others. 

The system model of the process with failed sensors can 
be obtained as: 

$$\begin{aligned} x(n+1) &= A\,x(n) + B\,u(n) \\ y_{sf}(n) &= F_s C\,x(n) + f_{so} \end{aligned} \tag{6}$$

System Degradation Model 

In the case of system performance degradation, it is 
assumed that only the system matrix A will be affected. The 
new system matrix under this fault condition becomes $A_f$. In 
general, the fault model can be represented by: 

$$A_f = A + \Delta A \tag{7}$$

where $\Delta A$ is a matrix representing the effect of the fault 
mode under study. The determination of the elements of 
$\Delta A$ requires the analysis of the system using a physical 
model or empirical data. 

The process model of a system with performance 
degradation becomes: 

$$\begin{aligned} x(n+1) &= (A + \Delta A)\,x(n) + B\,u(n) \\ y(n) &= C\,x(n) \end{aligned} \tag{8}$$

We now define $F_a$, $f_{ao}$, $F_s$, $f_{so}$ and $\Delta A$ as 
fault parameters. The following section describes the strategy 
of detecting the fault and estimating the fault parameters using 
a distributed on-line parameter identification scheme. 

For a complete model that describes all three possible 
classes of faults the system equation will be: 

$$\begin{aligned} x(n+1) &= (A + \Delta A)\,x(n) + B F_a\,u_c(n) + B f_{ao} \\ y_{sf}(n) &= F_s C\,x(n) + f_{so} \end{aligned} \tag{9}$$
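
To make the role of each fault parameter concrete, the following sketch implements one step of the combined fault model of equation (9). The function name `faulty_step`, the default arguments and the numerical values are assumptions chosen for illustration; with all defaults the step reduces to the nominal model of equation (1).

```python
import numpy as np

def faulty_step(x, u_c, A, B, C, dA=None, F_a=None, f_ao=None, F_s=None, f_so=None):
    """One step of the combined fault model; the defaults give the nominal system."""
    n, m, p = A.shape[0], B.shape[1], C.shape[0]
    dA = np.zeros((n, n)) if dA is None else dA      # system degradation
    F_a = np.eye(m) if F_a is None else F_a          # actuator gain distortion
    f_ao = np.zeros(m) if f_ao is None else f_ao     # actuator bias
    F_s = np.eye(p) if F_s is None else F_s          # sensor gain distortion
    f_so = np.zeros(p) if f_so is None else f_so     # sensor bias

    y_sf = F_s @ (C @ x) + f_so          # measured output, possibly faulty
    u_af = F_a @ u_c + f_ao              # actual actuator output
    x_next = (A + dA) @ x + B @ u_af     # state update with degraded dynamics
    return x_next, y_sf

# Illustrative 2-state system with two actuators and two sensors.
A, B, C = 0.5 * np.eye(2), np.eye(2), np.eye(2)
x, u_c = np.array([1.0, 1.0]), np.array([1.0, 1.0])

nominal = faulty_step(x, u_c, A, B, C)
# First actuator stuck: its gain drops to zero and a constant bias remains.
stuck = faulty_step(x, u_c, A, B, C, F_a=np.diag([0.0, 1.0]), f_ao=np.array([0.3, 0.0]))
```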


FAULT-DETECTION AND DIAGNOSIS 

In the fault-detection and diagnosis for the system 
modelled by equation (9), one approach is to have an on-line 
estimation algorithm for all fault parameters in the equation. 
The estimated fault parameters can then be compared to the 
predetermined signatures of the fault modes of different 
classes. This approach is difficult because many fault 
parameters must be estimated at the same time. Also, the 
signatures of the fault parameters can be ambiguous if they are 
estimated by a single module. Thus, instead of direct 
estimation of the parameter matrices A, B, C, and their related 
fault parameters, a two-step approach is proposed. The first 
step is composed of a group of "Hypothesis Testing Modules" 
(HTM) that run in parallel, one for each class of faults. Each 
module is designed solely to process the input/output data 
under a specified hypothesis and generate the fault signature 
data for diagnostic purposes. The second step is the fault 
diagnosis module, which checks all the information obtained 
from the HTM level, isolates the fault, and determines its 
magnitude. Figure 1 shows the structure of the fault 
detection and diagnostic system. 

Hypothesis Testing Modules 

As illustrated in Figure 1, there are three fault parameter 
estimation modules in the first data processing layer. These 
modules are used for on-line identification of fault parameters 
corresponding to hypothesized actuator, sensor or system faults. 
The first module processes the data under the assumption of 
possible actuator faults, i.e. as modelled by equation (4). The 
goal of this module is to estimate the actuator fault parameters 
($F_a$ and $f_{ao}$) using the on-line input/output data ($u_c$ 
and y), assuming the system matrices A, B and C are known. Since 
the fault parameters are the only unknowns in equation (4), they 
can be estimated by a recursive on-line parameter estimation 
algorithm. Likewise, the sensor fault hypothesis testing module 
uses equation (6) and the system degradation testing module uses 
equation (8) to estimate their fault parameters. Upon the 
estimation of the fault parameters, it is also necessary to 
determine the validity of the hypothesis. This is accomplished 
by comparing an output estimate obtained using the estimated 
fault parameters with the actual measured output. For this 
purpose the residual of the proposed model is defined as: 

$$e_{ij}(n) = z_i(n) - \hat{y}_i(n \mid n-1, H_j) \tag{10}$$

Here the subscripts i and j refer to the i'th output and the 
j'th class of faults. $z_i(n)$ is the measurement of the i'th 
output. $H_j$ represents the hypothesis that the fault belongs 
to the j'th class of faults. $\hat{y}_i(n \mid n-1, H_j)$ is the 
estimate of the i'th output given all the information up to the 
(n-1)'th sample under the hypothesis $H_j$. The values of 
$e_{ij}$ are calculated at each step using the most recent 
estimate of the fault parameters, and the statistics of 
$e_{ij}$ are used to accept or reject the hypothesis. 

Figure 1 Distributed Model-Based Fault Detection and Diagnostic System 
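
The paper does not name the recursive estimator, so the following is only a sketch of the actuator-fault module using standard recursive least squares with a forgetting factor. It assumes a scalar plant whose state is measured directly and noise-free, so that $(x(n+1) - a\,x(n))/b = F_a u_c(n) + f_{ao}$ is linear in the two fault parameters. The plant values, the forgetting factor `lam` and the fault scenario are hypothetical. The a priori prediction error returned by `rls_update` plays the role of the residual in equation (10).

```python
import numpy as np

def rls_update(theta, P, phi, z, lam=0.9):
    """One recursive least-squares step with forgetting factor lam."""
    residual = z - phi @ theta                 # one-step-ahead prediction error
    gain = P @ phi / (lam + phi @ P @ phi)
    theta = theta + gain * residual
    P = (P - np.outer(gain, phi @ P)) / lam
    return theta, P, residual

# Known scalar plant (illustrative values): x(n+1) = a x(n) + b u_af(n).
a, b = 0.8, 0.5
theta = np.array([1.0, 0.0])                   # start at nominal F_a = 1, f_ao = 0
P = 100.0 * np.eye(2)
x = 0.0
residuals = []
for n in range(200):
    u_c = 1.0 + (0.1 if n % 3 == 0 else -0.1)  # persistently exciting command
    F_true, f_true = (1.0, 0.0) if n < 25 else (0.0, 0.3)  # valve sticks at n = 25
    x_next = a * x + b * (F_true * u_c + f_true)
    z = (x_next - a * x) / b                   # equals F_a u_c + f_ao under the hypothesis
    theta, P, e = rls_update(theta, P, np.array([u_c, 1.0]), z)
    residuals.append(e)
    x = x_next
```

In this scenario the residual jumps when the fault is injected and decays as the estimate moves from the nominal values toward the stuck-valve values, which is the qualitative behaviour the hypothesis test relies on.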

Fault Detection and Diagnosis Logic 

This module examines all the estimated fault parameter 
values and the statistics of the residual vectors and generates a 
conclusion as to the fault status of the system. This is done by 
1) comparing statistics of the residual vectors against 
preselected thresholds, 2) comparing the fault parameters against 
predetermined signatures, and 3) comparing the relative 
magnitude of the statistics of the residual vectors among all the 
hypothesis testing modules. By examining the relative 
magnitudes of the residual vectors from the different hypothesis 
modules we are able to detect the fault, classify the fault type, 
and estimate its magnitude. For example, when operating with 
an actuator fault, it is expected that the magnitude of the 
residual generated by the first hypothesis module (assuming an 
actuator fault) will be significantly smaller than those generated 
by the other hypothesis modules. Also, the estimated fault 
parameters $F_a$ and $f_{ao}$ will indicate the type of 
actuator fault. 
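
The threshold and relative-magnitude comparisons can be sketched as follows. The use of the root-mean-square value as the residual statistic, the function name `select_hypothesis`, the acceptance threshold and the factor of ten between the best and second-best module are all assumptions made for illustration.

```python
import math

def select_hypothesis(residuals, accept_threshold, ratio=10.0):
    """Pick the fault class whose module explains the data best.

    residuals maps a hypothesis name to its recent residual samples.
    The best hypothesis is accepted only if its RMS residual is below
    accept_threshold and the runner-up is at least `ratio` times larger.
    """
    rms = {name: math.sqrt(sum(e * e for e in r) / len(r))
           for name, r in residuals.items()}
    ranked = sorted(rms, key=rms.get)
    best, runner_up = ranked[0], ranked[1]
    accepted = rms[best] <= accept_threshold and rms[runner_up] >= ratio * rms[best]
    return (best if accepted else None), rms

# Illustrative residual windows from the three hypothesis testing modules.
windows = {
    "actuator": [0.01, -0.02, 0.01],
    "sensor": [0.5, 0.4, -0.6],
    "degradation": [0.3, 0.2, 0.25],
}
fault_class, rms = select_hypothesis(windows, accept_threshold=0.05)
```

Returning `None` when no module stands out leaves room for the signature comparison of step 2 to resolve ambiguous cases.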

Once a fault is detected, it may be isolated to the 
component that has failed by comparing the fault parameters 
with the known signatures of the fault modes. Measures can 
then be taken to compensate for the fault through 
reconfiguration [2]. This diagnosis-induced accommodation 
includes both hardware actions (e.g., activating back-up systems) and 
software tasks (e.g., adjusting the feedback control appropriate- 
ly, or estimating the measurement of a failed sensor). The 
diagnostic and monitoring tasks may be carried out by an on- 
board processor, on-line and in real-time, as well as an off-line 
processor which analyzes recorded data for life cycle analysis 
and preventive maintenance. 

AN EXAMPLE: 

FAULT-DETECTION AND DIAGNOSIS OF THE SSME 

The fault-detection and diagnosis (FDD) system based 
on fault parameter estimation developed in this study was 
applied to the detection and diagnosis of actuator and 
sensor faults for the Space Shuttle Main Engine (SSME). A 
linearized model of the SSME nominal operation is given in 
[9,10]. A piecewise linear model which covers a wide range of 
operation was developed in [11]. The system parameters 
developed in [10,11] are used as a priori knowledge for the FDD 
system. 

The signature of a fault mode can usually be obtained 
through the analysis of physical properties or empirical data. 
In the Space Shuttle Main Engine study, the commonly observed 
actuator faults can be classified into four types: valve ball 
seal leakage or crack, valve line blockage, stuck valve and loss 
of rotational variable displacement transformer (RVDT) signals 
[8]. A ball seal leakage may cause increased flow rate through 
the valve for the same actuator input, causing the fault vector 
parameter $f_{ao}$ to have a nonzero component associated with 
the faulty valve. The value of this nonzero element yields the 
amount of leakage. A shaft seal leakage may cause a diaphragm 
rupture and consequently a stuck valve. This would cause the 
elements of $F_a$ and $f_{ao}$ associated with the faulty valve 
to change from a value of one to a value of zero and from a 
value of zero to a nonzero value, respectively. A broken wire 
in the RVDT system may lead to a signal error, causing the 
valve to continuously increase its opening until it is fully 
open. Table 1 describes part of these fault signatures, i.e., it 
gives the values of the fault parameters corresponding to each 
signature as well as the values for some combinations of these 
faults. 
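
A signature comparison for a single valve might look like the sketch below. It encodes only the signatures described in the text: a stuck valve drives the gain element to zero with a nonzero bias, and a leak leaves the gain at one with a positive bias. The negative bias for a line blockage is inferred from the Case 3 result rather than from Table 1, and the tolerance `tol` and the function name are hypothetical.

```python
def classify_actuator_fault(F_ii, f_i, tol=0.1):
    """Map one valve's estimated gain F_ii and bias f_i to a fault label."""
    gain_nominal = abs(F_ii - 1.0) <= tol
    gain_zero = abs(F_ii) <= tol
    if gain_nominal and abs(f_i) <= tol:
        return "nominal"
    if gain_zero and abs(f_i) > tol:
        return "stuck valve"
    if gain_nominal and f_i > tol:
        return "ball seal leakage"    # extra flow appears as a positive bias
    if gain_nominal and f_i < -tol:
        return "line blockage"        # sign assumed from the Case 3 estimate
    return "unclassified"             # e.g. loss of RVDT signal, not modelled here

labels = [classify_actuator_fault(F, f) for F, f in [(1.0, 0.0), (0.0, 1.5), (1.0, 2.0), (1.0, -2.0)]]
```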


A complete nonlinear digital transient model (DTM) of 
the SSME was developed by Rocketdyne Division of Rockwell 
International Corporation [12]. This nonlinear simulation is 
used to simulate the SSME dynamic responses for nominal 
operation and fault conditions. The inputs of the simulation are 
the positions of the oxidizer preburner oxidizer valve (OPOV) 
and the fuel preburner oxidizer valve (FPOV). The measured 
simulation outputs are the chamber inlet pressure (Pc), mixture 
ratio (MR), high pressure fuel turbine speed (SF2), and high 
pressure oxidizer turbine speed (SO2). The operating condition 
selected for study is 100% rated power level with a nominal 
mixture ratio of 6.026. A closed loop control (PI controller) 
in the DTM simulation is also active to simulate the actual 
operation. The sampling time of the system identification is 
0.04 seconds. Pseudo random binary sequences (PRBS) with a 
magnitude of 1% of the control command are superimposed on 
the command signal. A recursive parameter identification 
scheme is used to identify the fault parameters for all the cases 
below. In all the following cases, the simulation was started 
from steady state. 
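
The PRBS generator used in the study is not specified. As one common possibility, the sketch below produces a maximal-length sequence from a 7-bit linear feedback shift register (polynomial x^7 + x^6 + 1, period 127) and superimposes it on a command at 1% of the command value. The names `prbs7` and `excite` and the constant command are illustrative.

```python
def prbs7(length, seed=0x7F):
    """Return `length` samples of a +/-1 PRBS from a 7-bit LFSR (x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    samples = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # feedback taps at stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        samples.append(1.0 if new_bit else -1.0)
    return samples

def excite(command, amplitude=0.01):
    """Superimpose a PRBS of relative magnitude `amplitude` on a command sequence."""
    return [c * (1.0 + amplitude * s) for c, s in zip(command, prbs7(len(command)))]

sequence = prbs7(254)
excited = excite([100.0] * 10)   # each sample is about 99 or 101
```

A two-level perturbation of this kind keeps the engine close to its operating point while still exciting the dynamics enough for recursive identification.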

Case 1: OPOV Stuck at Time = 1.0 Second 

Figure 2 shows a case in which the OPOV stuck at 1.0 
second. In this case the valve stopped responding to the 
input command. The expected parameter values for this type 
of fault are $F_{a11} = 0$, $F_{a22} = 1$, $f_{ao1} = C_{bias}$ 
(the magnitude of the bias depends on the stuck position of the 
valve and the desired position at the operating condition) and 
$f_{ao2} = 0$. The fault parameters are labelled as 
$F_a(1,1) = F_{a11}$, $f_{ao}(1) = f_{ao1}$, etc. The simulation 
shows that the diagnostic system is able not only to identify 
the correct actuator fault type after the initial transient but 
also to estimate the magnitude of the bias due to the fault, 
which can be very important in designing the control 
accommodation for the fault. Figure 3 shows the on-line 
calculated residual defined by equation (10) under the 
hypothesis of an actuator fault. Values of the residual vector 
return to approximately zero after the initial transient. Figure 
4 shows the residual values calculated by the module which 
hypothesizes system degradation faults. In this figure, the 
residual vector elements are at least ten times higher than those 
in Figure 3. Similarly large residual values were computed by 
the third module. It can be seen that these values can be used 
to test the validity of the hypothesis modules. 

Figure 3 Residual Vector Computed by the Module Hypothesizing 
an Actuator Fault in Case 1 

Figure 4 Residual Vector Computed by the Module Hypothesizing 
a System Degradation in Case 1 

Figure 5 Estimated Fault Parameters in Case 2 


Case 2: OPOV Ball Seal Leak at Time = 1.0 second 

Figure 5 shows a case where an OPOV ball seal leak 
occurred at time 1.0 second. During the steady state operation 
before the fault occurs, the fault parameter estimates are 
$F_a = I$ and $f_{ao1} = 0$, as expected. After the fault, the 
parameter values are estimated at $F_a = I$, $f_{ao1} = 2$ (%) 
and $f_{ao2} = 0$. The simulation shows that the diagnostic 
system is able to identify the correct parameter after the 
initial transient. 

Case 3: FPOV Line Blockage at Time = 1.0 second 

Figure 6 shows a case in which the FPOV line became 
blocked at time 1.0 second. The fault parameter estimates start 
at the correct values of $F_a = I$ and $f_{ao1} = 0$ prior to 1.0 
second. After the fault, the parameter values are estimated at 
$F_a = I$, $f_{ao1} = 0$ and $f_{ao2} = -2$. The simulation shows 
that the diagnostic system is able to identify the correct 
parameter after the initial transient. 

Case 4: Simultaneous OPOV Leakage and FPOV Blockage 

In this case, the faults of Cases 2 and 3 were introduced 
at the same time (T = 1.0). The final true parameter values in 
this case are $F_a = I$, $f_{ao1} = +2.0$ and $f_{ao2} = -2.0$. 
Figure 7 shows that the proposed hypothesis testing module is 
able to correctly estimate the fault parameter values within 2 
to 3 seconds. 

Case 5: Bias in Chamber Pressure (Pc) Sensor 

Figure 8 shows the results obtained for the case of a 
faulty sensor with a bias of 1% on sensor one (chamber 
pressure). As expected, the estimated fault parameters are 
$F_s = I$ and $f_{so} = 0$, except for $f_{so1}$, which 
is the indicator of the Pc measurement bias. 

As illustrated in these simulation results, both fault 
detection and the estimation of the extent of the faults can be 
accomplished by using the proposed approach. These simulations 
indicate that a duration of two seconds is sufficient for the 
fault detection and diagnosis. 

Figure 6 Estimated Fault Parameters in Case 3 

Figure 8 Estimated Fault Parameters in Case 5 

CONCLUSION 

A fault-detection and diagnosis system based on 
distributed fault-parameter estimation is developed. Actuator, 
sensor and system degradation fault modes are considered by 
the developed FDD system. In the FDD system, the system 
inputs and outputs are first processed by a series of hypothesis 
testing modules. Each hypothesis module generates estimates 
of selected fault parameters and corresponding residuals. The 
fault parameters and residuals generated by the hypothesis 
modules are used for fault-detection and diagnosis. The 
proposed FDD system is demonstrated by applying it to detect 
actuator and sensor faults added to a simulation of the Space 
Shuttle Main Engine. The simulation results show that the 
proposed FDD system can adequately detect the faults and 
estimate their magnitudes. Further research in the application 
of this scheme to system degradation faults is currently 
underway. 